25 research outputs found

The Benefit of Multitask Representation Learning

Author: Maurer Andreas
Pontil Massimiliano
Romera-Paredes Bernardino
Publication date: 25/03/2016

We discuss a general method to learn data representations from multiple tasks. We provide a justification for this method in both settings of multitask learning and learning-to-learn. The method is illustrated in detail in the special case of linear feature learning. Conditions on the theoretical advantage offered by multitask representation learning over independent task learning are established. In particular, focusing on the important example of half-space learning, we derive the regime in which multitask representation learning is beneficial over independent task learning, as a function of the sample size, the number of tasks and the intrinsic data dimensionality. Other potential applications of our results include multitask feature learning in reproducing kernel Hilbert spaces and multilayer, deep networks.

Comment: To appear in Journal of Machine Learning Research (JMLR). 31 page

arXiv.org e-Print Archive

UCL Discovery

Object segmentation in depth maps with one user click and a synthetically trained fully convolutional network

Author: A Rozantsev
Bernardino Romera-Paredes
C. Lawrence Zitnick
Nathan Silberman
P Arbeláez
Pedro O. Pinheiro
Tsung-Yi Lin
Publication date: 06/11/2017

With more and more household objects built on planned obsolescence and consumed by a fast-growing population, hazardous waste recycling has become a critical challenge. Given the large variability of household waste, current recycling platforms mostly rely on human operators to analyze the scene, typically composed of many object instances piled up in bulk. Helping them by robotizing the unitary extraction is a key challenge to speed up this tedious process. Whereas supervised deep learning has proven very efficient for such object-level scene understanding, e.g., generic object detection and segmentation in everyday scenes, it however requires large sets of per-pixel labeled images, that are hardly available for numerous application contexts, including industrial robotics. We thus propose a step towards a practical interactive application for generating an object-oriented robotic grasp, requiring as inputs only one depth map of the scene and one user click on the next object to extract. More precisely, we address in this paper the middle issue of object segmentation in top views of piles of bulk objects given a pixel location, namely seed, provided interactively by a human operator. We propose a twofold framework for generating edge-driven instance segments. First, we repurpose a state-of-the-art fully convolutional object contour detector for seed-based instance segmentation by introducing the notion of edge-mask duality with a novel patch-free and contour-oriented loss function. Second, we train one model using only synthetic scenes, instead of manually labeled training data. Our experimental results show that considering edge-mask duality for training an encoder-decoder network, as we suggest, outperforms a state-of-the-art patch-based network in the present application context.

Comment: This is a pre-print of an article published in Human Friendly Robotics, 10th International Workshop, Springer Proceedings in Advanced Robotics, vol 7. The final authenticated version is available online at: https://doi.org/10.1007/978-3-319-89327-3_16, Springer Proceedings in Advanced Robotics, Siciliano Bruno, Khatib Oussama, In press, Human Friendly Robotics, 10th International Workshop,

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Hal-Diderot

Conditional Random Fields as Recurrent Neural Networks

Author: Du Dalong
Huang Chang
Jayasumana Sadeep
Romera-Paredes Bernardino
Su Zhizhong
Torr Philip H. S.
Vineet Vibhav
Zheng Shuai
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/04/2016

Pixel-level labelling tasks, such as semantic segmentation, play a central role in image understanding. Recent approaches have attempted to harness the capabilities of deep learning techniques for image recognition to tackle pixel-level labelling tasks. One central issue in this methodology is the limited capacity of deep learning techniques to delineate visual objects. To solve this problem, we introduce a new form of convolutional neural network that combines the strengths of Convolutional Neural Networks (CNNs) and Conditional Random Fields (CRFs)-based probabilistic graphical modelling. To this end, we formulate mean-field approximate inference for the Conditional Random Fields with Gaussian pairwise potentials as Recurrent Neural Networks. This network, called CRF-RNN, is then plugged in as a part of a CNN to obtain a deep network that has desirable properties of both CNNs and CRFs. Importantly, our system fully integrates CRF modelling with CNNs, making it possible to train the whole deep network end-to-end with the usual back-propagation algorithm, avoiding offline post-processing methods for object delineation. We apply the proposed method to the problem of semantic image segmentation, obtaining top results on the challenging Pascal VOC 2012 segmentation benchmark.

Comment: This paper is published in IEEE ICCV 201

arXiv.org e-Print Archive

Crossref

Making CNNs for Video Parsing Accessible

Author: Fulda Nancy
Furlanello Tommaso
Guzdial Matthew
Han Song
Luo Zijin
Makarovych Sasha
McCallum Andrew
Romera-Paredes Bernardino
Shaker Noor
Summerville Adam
Yannakakis Georgios N
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 10/06/2019

The ability to extract sequences of game events for high-resolution e-sport games has traditionally required access to the game's engine. This serves as a barrier to groups who don't possess this access. It is possible to apply deep learning to derive these logs from gameplay video, but it requires computational power that serves as an additional barrier. These groups would benefit from access to these logs, such as small e-sport tournament organizers who could better visualize gameplay to inform both audience and commentators. In this paper we present a combined solution to reduce the required computational resources and time to apply a convolutional neural network (CNN) to extract events from e-sport gameplay videos. This solution consists of techniques to train a CNN faster and methods to execute predictions more quickly. This expands the types of machines capable of training and running these models, which in turn extends access to extracting game logs with this approach. We evaluate the approaches in the domain of DOTA2, one of the most popular e-sports. Our results demonstrate our approach outperforms standard backpropagation baselines.

Comment: 11 pages, 6 figures, Foundations of Digital Games 201

arXiv.org e-Print Archive

Crossref

Deep Thermal Imaging: Proximate Material Type Recognition in the Wild through Deep Learning of Spatial Surface Temperature Patterns

Author: Anind
Callister William D.
Cho Youngjun
Hendriks Antonius
Holone H.
Jaderberg Max
Krizhevsky Alex
Kurz Daniel
Leonard John J.
Liu C.
Liu H.
Lloyd J. M.
Marks
Nair Vinod
Ren Shaoqing
Romera-Paredes Bernardino
Victor
Vollmer Michael
Zhu Xiangxin
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 06/03/2018

We introduce Deep Thermal Imaging, a new approach for close-range automatic recognition of materials to enhance the understanding of people and ubiquitous technologies of their proximal environment. Our approach uses a low-cost mobile thermal camera integrated into a smartphone to capture thermal textures. A deep neural network classifies these textures into material types. This approach works effectively without the need for ambient light sources or direct contact with materials. Furthermore, the use of a deep learning network removes the need to handcraft the set of features for different materials. We evaluated the performance of the system by training it to recognise 32 material types in both indoor and outdoor environments. Our approach produced recognition accuracies above 98% in 14,860 images of 15 indoor materials and above 89% in 26,584 images of 17 outdoor materials. We conclude by discussing its potentials for real-time use in HCI applications and future directions.

Comment: Proceedings of the 2018 CHI Conference on Human Factors in Computing System

arXiv.org e-Print Archive

Crossref

UCL Discovery